Smart Paper Notes

Applying Eigencontours to PolarMask-Based Instance Segmentation

Wonhui Park , Dongkwon Jin , Chang-Su Kim

Category: Computer Vision

2022-08-24

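Eigencontours are the first data-driven contour descriptors based on singular value decomposition. Following the implementation of ESE-Seg, eigencontours have been applied to the instance segmentation task. In this report, we incorporate eigencontours into the PolarMask network for instance segmentation. Experimental results show that the proposed algorithm yields better results than PolarMask on two instance segmentation datasets, COCO2017 and SBD. We also analyze the characteristics of eigencontours qualitatively. Our code is available at https://github.com/dnjs3594/eigencontours.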

translated by Google Translate

Depth Map Decomposition for Monocular Depth Estimation

Jinyoung Jun , Jae-Han Lee , Chul Lee , Chang-Su Kim

Category: Computer Vision

2022-08-23

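We propose a novel algorithm for monocular depth estimation that decomposes a metric depth map into a normalized depth map and scale features. The proposed network consists of a shared encoder and three decoders, called G-Net, N-Net, and M-Net, which estimate gradient maps, a normalized depth map, and a metric depth map, respectively. M-Net learns to estimate metric depths more accurately using the relative depth features extracted by G-Net and N-Net. The proposed algorithm has the advantage that it can use datasets without metric depth labels to improve the performance of metric depth estimation. Experimental results on various datasets show that the proposed algorithm not only provides performance competitive with state-of-the-art algorithms but also yields acceptable results even when only a small amount of metric depth data is available for training.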
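To make the idea of separating scale from relative depth concrete, here is a minimal sketch. It assumes the simplest possible decomposition: the scale is a single number (the mean depth) and the normalized map is the depth divided by that mean. This is an illustrative choice, not the paper's definition, which uses learned scale features; the function names `decompose` and `recompose` are hypothetical.

```python
from math import isclose

def decompose(depth):
    """Split a metric depth map into (normalized map, scalar scale).

    Assumption for illustration: scale = mean depth of the map.
    """
    flat = [d for row in depth for d in row]
    scale = sum(flat) / len(flat)
    normalized = [[d / scale for d in row] for row in depth]
    return normalized, scale

def recompose(normalized, scale):
    """Recover the metric depth map from its two parts."""
    return [[n * scale for n in row] for row in normalized]

depth = [[1.0, 2.0], [3.0, 6.0]]                      # toy depths in metres
doubled = [[2.0 * d for d in row] for row in depth]   # same scene, twice as far

norm_a, scale_a = decompose(depth)
norm_b, scale_b = decompose(doubled)

# The normalized maps agree even though the metric scales differ.
same_shape = all(
    isclose(a, b) for ra, rb in zip(norm_a, norm_b) for a, b in zip(ra, rb)
)
print(scale_a, scale_b, same_shape)
```

Because the normalized map does not change under a global rescaling, it can in principle be supervised with data whose absolute scale is unknown, which is the property the abstract relies on when it mentions datasets without metric depth labels.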


DPICT: Deep Progressive Image Compression Using Trit-Planes

Jae-Han Lee , Seungmin Jeon , Kwang Pyo Choi , Youngo Park , Chang-Su Kim

Category: Computer Vision

2021-12-12

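We propose the deep progressive image compression using trit-planes (DPICT) algorithm, which is the first learning-based codec supporting fine granular scalability (FGS). First, we transform an image into a latent tensor using an analysis network. Then, we represent the latent tensor in ternary digits (trits) and encode it into a compressed bitstream trit-plane by trit-plane, in decreasing order of significance. Moreover, within each trit-plane, we sort the trits according to their rate-distortion priorities and transmit the more important information first. Since the compression network is less optimized for the cases that use fewer trit-planes, we develop a postprocessing network for refining reconstructed images at low rates. Experimental results show that DPICT significantly outperforms conventional progressive codecs while enabling FGS transmission.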
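The trit-plane representation can be sketched on plain integers. The example below is only an illustration under simplifying assumptions: the values are small non-negative integers rather than a learned latent tensor, and the analysis network, entropy coding, rate-distortion sorting, and postprocessing network are all omitted. The helper names `to_trit_planes` and `reconstruct` are hypothetical.

```python
def to_trit_planes(values, num_planes):
    """Split non-negative integers into trit-planes, most significant first."""
    planes = []
    for p in range(num_planes - 1, -1, -1):
        planes.append([(v // 3 ** p) % 3 for v in values])
    return planes

def reconstruct(planes, num_planes):
    """Rebuild values from the first len(planes) planes (at least one).

    Trits that were not received are filled with 1, which places each
    estimate at the midpoint of the interval of values still possible.
    """
    missing = num_planes - len(planes)
    fill = (3 ** missing - 1) // 2
    out = []
    for i in range(len(planes[0])):
        prefix = 0
        for plane in planes:
            prefix = prefix * 3 + plane[i]
        out.append(prefix * 3 ** missing + fill)
    return out

values = [5, 17, 26]
planes = to_trit_planes(values, 3)
print(planes)                       # [[0, 1, 2], [1, 2, 2], [2, 2, 2]]
print(reconstruct(planes[:1], 3))   # [4, 13, 22]
print(reconstruct(planes[:2], 3))   # [4, 16, 25]
print(reconstruct(planes, 3))       # [5, 17, 26]
```

Each additional plane narrows the interval a value can lie in by a factor of three, so truncating the stream after any plane still gives a usable, coarser reconstruction.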


IceNet for Interactive Contrast Enhancement

Keunsoo Ko , Chang-Su Kim

Category: Computer Vision

2021-09-13

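This work proposes a CNN-based interactive contrast enhancement algorithm, called IceNet, which enables a user to adjust image contrast according to his or her preference. Specifically, the user provides a parameter for controlling the global brightness and two types of scribbles to darken or brighten local regions in an image. Then, given these annotations, IceNet estimates a gamma map for pixel-wise gamma correction. Finally, through color restoration, an enhanced image is obtained. The user may provide annotations iteratively to obtain a satisfactory image. IceNet is also capable of producing a personalized enhanced image automatically, which can serve as a basis for further adjustment if needed. Moreover, to train IceNet effectively and reliably, we propose three differentiable losses. Extensive experiments show that IceNet can provide users with satisfactorily enhanced images.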
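Pixel-wise gamma correction with a gamma map can be sketched as follows. This assumes the common form out = in ** gamma on intensities in [0, 1], where gamma below 1 brightens a pixel and gamma above 1 darkens it. In IceNet the gamma map is estimated by the network from the user's annotations; here it is written by hand, and the function name `gamma_correct` is hypothetical.

```python
def gamma_correct(image, gamma_map):
    """Apply out = in ** gamma independently at every pixel.

    image and gamma_map are equally sized 2D lists; intensities lie in [0, 1].
    """
    return [
        [pixel ** gamma for pixel, gamma in zip(img_row, gamma_row)]
        for img_row, gamma_row in zip(image, gamma_map)
    ]

image = [[0.25, 0.25],
         [0.25, 0.25]]
gamma_map = [[0.5, 1.0],   # top-left: brighten, top-right: unchanged
             [2.0, 1.0]]   # bottom-left: darken, bottom-right: unchanged

enhanced = gamma_correct(image, gamma_map)
print(enhanced)
```

A spatially varying gamma map lets a single operation brighten some regions and darken others, which is what the two scribble types in the abstract are meant to control.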


Class-Continuous Conditional Generative Neural Radiance Field

Jiwook Kim , Minhyeok Lee

Categories: Computer Vision | Artificial Intelligence

2023-01-03

3D-aware image synthesis focuses on preserving spatial consistency in addition to generating high-resolution images with fine details. Recently, Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate generative NeRFs and show remarkable achievements, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated photorealistic 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated on three image datasets: AFHQ, CelebA, and Cars. Our model shows strong 3D consistency with fine details and smooth interpolation in conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF achieves a Fréchet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis at a resolution of $128^{2}$. Additionally, we provide FIDs of generated 3D-aware images for each class of the datasets, since $\text{C}^{3}$G-NeRF makes it possible to synthesize class-conditional images.


A contrastive learning approach for individual re-identification in a wild fish population

Ørjan Langøy Olsen , Tonje Knutsen Sørdalen , Morten Goodwin , Ketil Malde , Kristian Muri Knausgård , Kim Tallaksen Halvorsen

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2023-01-02

In both terrestrial and marine ecology, physical tagging is a frequently used method to study population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis. This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, supported by a projection head, and we use contrastive learning to find similar or dissimilar image pairs from a collection of uniform photographs. We apply this technique to corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings may range from a few days to several years. On our dataset, the model achieves a one-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88.


Learning to Maximize Mutual Information for Dynamic Feature Selection

Ian Covert , Wei Qiu , Mingyu Lu , Nayoon Kim , Nathan White , Su-In Lee

Categories: Machine Learning | Machine Learning (Statistics)

2023-01-02

Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning (RL), but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality and outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
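The greedy rule based on conditional mutual information can be sketched on a toy problem where the data distribution is fully known, which is exactly the oracle access the abstract says is unavailable in practice. The example is not the paper's learned method: it computes I(Y; X_j | x_S) exactly from an enumerated table. The toy distribution and the function names are assumptions made for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def conditional_mutual_information(rows, feature, observed):
    """I(Y; X_feature | X_S = x_S), with rows treated as the full distribution.

    observed maps already-queried feature indices to their observed values.
    """
    matching = [(x, y) for x, y in rows
                if all(x[j] == v for j, v in observed.items())]
    h_y = entropy([y for _, y in matching])
    groups = {}
    for x, y in matching:
        groups.setdefault(x[feature], []).append(y)
    h_y_given_x = sum(len(g) / len(matching) * entropy(g)
                      for g in groups.values())
    return h_y - h_y_given_x

def greedy_next_feature(rows, observed, num_features):
    """Pick the unobserved feature with the largest conditional mutual information."""
    candidates = [j for j in range(num_features) if j not in observed]
    return max(candidates,
               key=lambda j: conditional_mutual_information(rows, j, observed))

# Toy distribution: three uniform binary features; the label copies x1 when
# x0 == 0 and copies x2 when x0 == 1.
rows = [((a, b, c), b if a == 0 else c)
        for a in (0, 1) for b in (0, 1) for c in (0, 1)]

print(greedy_next_feature(rows, {0: 0}, 3))  # 1
print(greedy_next_feature(rows, {0: 1}, 3))  # 2
```

The best next query depends on the value already observed for feature 0, which is what distinguishes dynamic selection from choosing one static feature subset for every input.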


Design, Modeling, and Evaluation of Separable Tendon-Driven Robotic Manipulator with Long, Passive, Flexible Proximal Section

Christian DeBuys , Florin C. Ghesu , Jagadeesan Jayender , Reza Langari , Young-Ho Kim

Category: Robotics

2023-01-01

The purpose of this work was to tackle practical issues that arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot that overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input that resolves the redundancy in the kinematics and a physical interpretation of this redundancy are provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased with increasing proximal section angle for all testing conditions, with an average error reduction of 41.48% for re-tension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay, and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.


Situation-Aware Deep Reinforcement Learning for Autonomous Nonlinear Mobility Control in Cyber-Physical Loitering Munition Systems

Hyunsoo Lee , Soohyun Park , Won Joon Yun , Soyi Jung , Joongheon Kim

Category: Robotics

2022-12-31

With the rapid development of drone technologies, drones are widely used in many applications, including military domains. In this paper, a novel situation-aware DRL-based autonomous nonlinear drone mobility control algorithm is proposed for cyber-physical loitering munition applications. On the battlefield, designing a DRL-based autonomous control algorithm is not straightforward because real-world data gathering is generally not available. Therefore, the approach in this paper is to construct a cyber-physical virtual environment with Unity. Based on the virtual cyber-physical battlefield scenarios, a DRL-based automated nonlinear drone mobility control algorithm can be designed, evaluated, and visualized. Moreover, real-world battlefield scenarios contain many obstacles, which are harmful to linear trajectory control. Thus, our proposed autonomous nonlinear drone mobility control algorithm utilizes situation-aware components that are implemented with a Raycast function in Unity virtual scenarios. Based on the gathered situation-aware information, the drone can autonomously and nonlinearly adjust its trajectory during flight. Therefore, this approach is beneficial for avoiding obstacles in obstacle-deployed battlefields. Our visualization-based performance evaluation shows that the proposed algorithm is superior to the other linear mobility control algorithms.


X-MAS: Extremely Large-Scale Multi-Modal Sensor Dataset for Outdoor Surveillance in Real Environments

DongKi Noh , Changki Sung , Teayoung Uhm , WooJu Lee , Hyungtae Lim , Jaeseok Choi , Kyuewang Lee , Dasol Hong , Daeho Um , Inseop Chung

Category: Robotics

2022-12-30

In the robotics and computer vision communities, extensive studies have been conducted on surveillance tasks, including human detection, tracking, and motion recognition with a camera. Additionally, deep learning algorithms are widely utilized in these tasks, as in other computer vision tasks. Existing public datasets are insufficient for developing learning-based methods that handle surveillance in outdoor and extreme situations such as harsh weather and low illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g. an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics and present methods of exploiting our dataset with deep learning-based algorithms. The latest information on the dataset and our study is available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.
